Our Ref.: 76-105


U.S. PATENT APPLICATION


Inventor(s): Mark GIJZEN 


Invention: SEED COAT SPECIFIC DNA REGULATORY REGION AND
PEROXIDASE


NIXON & VANDERHYE P.C.

ATTORNEYS AT LAW 
1100 NORTH GLEBE ROAD 
8TH FLOOR
ARLINGTON VIRGINIA 22201-4714 
(703) 816-4000 
Facsimile (703) 816-4100 


SPECIFICATION


Seed coat specific DNA regulatory region and peroxidase 
The present invention relates to a novel DNA molecule comprising a plant seed 
coat specific DNA regulatory region and a novel structural gene encoding a peroxidase. 
The seed-coat specific DNA regulatory region may also be used to control the 
expression of other genes of interest within the seed coat.

BACKGROUND OF THE INVENTION 

Full citations for references appear at the end of the Examples section. 

Peroxidases are enzymes catalyzing oxidative reactions that use H2O2 as an 
electron acceptor. These enzymes are widespread and occur ubiquitously in plants as 
isozymes that may be distinguished by their isoelectric points. Plant peroxidases 
contribute to the structural integrity of cell walls by functioning in lignin biosynthesis 
and suberization, and by forming covalent cross-linkages between extensin, cellulose,
pectin and other cell wall constituents (Campa, 1991). Peroxidases are also associated
with plant defence responses and resistance to pathogens (Bowles, 1990;
Moerschbacher, 1992). Soybeans contain three anionic isozymes of peroxidase with a
minimum Mr of 37 kDa (Sessa and Anderson, 1981). Recently one peroxidase
isozyme, localised within the seed coat of soybean, has been characterized with a Mr
of 37 kDa (Gillikin and Graham, 1991).



In an analysis of soybean seeds, Buttery and Buzzell (1968) showed that the 
amount of peroxidase activity present in seed coats may vary substantially among 
different cultivars. The presence of a single dominant gene Ep causes a high seed coat 
peroxidase phenotype (Buzzell and Buttery, 1969). Homozygous recessive epep plants 
are ~100-fold lower in seed coat peroxidase activity. This results from a reduction in
the amount of peroxidase enzyme present, primarily in the hourglass cells of the
subepidermis (Gijzen et al., 1993). In plants carrying the Ep gene, peroxidase is
heavily concentrated in the hourglass cells (osteosclereids). These cells form a highly
differentiated cell layer with thick, elongated secondary walls and large intercellular
spaces (Baker et al., 1987). Hourglass cells develop between the epidermal
macrosclereids and the underlying articulated parenchyma, and are a prominent feature
of seed coat anatomy at full maturity. The cytoplasm exudes from the hourglass cells
upon imbibition with water and a distinct peroxidase isozyme constitutes five to 10%
of the total soluble protein in EpEp seed coats. It is not known why the hourglass cells
accumulate large amounts of peroxidase, but the sheer abundance and relative purity
of the enzyme in soybean seed coats is significant because peroxidases are versatile
enzymes with many commercial and industrial applications. Studies of soybean seed
coat peroxidase have shown this enzyme to have useful catalytic properties and a high
degree of thermal stability even at extremes of pH (McEldoon et al., 1995). These
properties result in the preferred use of soybean peroxidase, over that of horseradish
peroxidase, in diagnostic assays as an enzyme label for antigens, antibodies,
oligonucleotide probes, and within staining techniques. Johnson et al. report on the use
of soybean peroxidase for the deinking of printed waste paper (U.S. 5,270,770;
December 6, 1994) and for the biocatalytic oxidation of primary alcohols (U.S.
5,391,488; February 13, 1996). Soybean peroxidase has also been used as a
replacement for chlorine in the pulp and paper industry, or as formaldehyde
replacement (Freiberg, 1995).

An anionic soybean peroxidase from seed coats has been purified (Gillikin and
Graham, 1991). This protein has a pI of 4.1 and Mr of 37 kDa. A method for the
bulk extraction of peroxidase from seed hulls of soybean using a freeze thaw technique
has also been reported (U.S. 5,491,085, February 13, 1996, Pokara and Johnson).

Lagrimini et al. (1987) disclose the cloning of a ubiquitous anionic peroxidase
in tobacco encoding a protein of Mr of 36 kDa. This peroxidase has also been
overexpressed in transgenic tobacco plants (Lagrimini et al., 1990) and Maliyakal discloses
the expression of this gene in cotton (WO 95/08914).

Huangpu et al. (1995) reported the partial cloning of a soybean anionic seed coat
peroxidase. The 1031 bp sequence contained an open reading frame of 849 bp
encoding a 283 amino acid protein with a Mr of 30,577. The Mr of this peroxidase is
7 kDa less than what one would expect for a soybean seed coat peroxidase as reported
by Gillikin and Graham (1991) and possibly represents another peroxidase isozyme
within the seed coat.

The upstream promoter sequences for two poplar peroxidases have been
described by Osakabe et al (1995). A number of characteristic regulatory sites were 
identified from comparison of these sequences to existing promoter elements. 
Additionally, a cryptic promoter with apparent specificity for seed coat tissues was 
isolated from tobacco by a promoter trapping strategy (Fobert et al. 1994). The 
upstream regulatory sequences associated with the Ep gene in soybean are distinct from 
these and other previously characterized promoters. The soybean Ep promoter drives 
high-level expression in a cell and tissue specific manner. The peroxidase protein 
encoded by the Ep gene accumulates in the seed coat tissues, especially in the
hourglass cells of the subepidermis. Minimal expression of the gene is detected in root
tissues.


One problem arising from the desired use of soybean seed coat peroxidase is 
that there is variability between soybean varieties regarding peroxidase production 
(Buttery and Buzzell, 1986; Freiberg, 1995). Due to the commercial interest in the use 
of soybean seed coat peroxidase, new methods of producing this enzyme are required.
Therefore, the gene responsible for the expression of the 37 kDa isozyme in soybean 
seed coat was isolated and characterized. 




Furthermore, novel regulatory regions obtained from the genomic DNA of
soybean seed coat peroxidase have been isolated and characterized and are useful in
directing the expression of genes of interest in seed coat tissues.


SUMMARY OF THE INVENTION 

The present invention relates to a DNA molecule that encodes a soybean seed 
coat peroxidase and associated DNA regulatory regions. 


This invention also embraces isolated DNA molecules comprising the nucleotide 
sequence of either SEQ ID NO:1 (the cDNA encoding soybean seed coat peroxidase)
or SEQ ID NO:2 (the genomic sequence).

This invention also provides for a chimeric DNA molecule comprising a seed 
coat-specific regulatory region having nucleotides 1-1532 of SEQ ID NO:2 and a gene 
of interest under control of this DNA regulatory region. Also included within this 
invention are chimeric DNA molecules comprising genomic DNA sequences 


Furthermore, this invention is directed to isolated DNA molecules comprising at least
    1) 24 contiguous nucleotides selected from nucleotides 1-1532 of SEQ ID NO:2;
    2) 32 contiguous nucleotides selected from nucleotides 412-1041 of SEQ ID NO:2;
    3) 23 contiguous nucleotides selected from nucleotides 1234-2263 of SEQ ID NO:2; or
    4) 22 contiguous nucleotides selected from nucleotides 2430-2691 of SEQ ID NO:2.

The present invention also provides for vectors which comprise DNA molecules
encoding soybean seed coat peroxidase. Such a construct may include the DNA
regulatory region from SEQ ID NO:2, including nucleotides 1-1532, or at least 24
contiguous nucleotides selected from nucleotides 1-1532 of SEQ ID NO:2 in
conjunction with the seed coat peroxidase gene, or the seed coat peroxidase gene under
the control of any suitable constitutive or inducible promoter of interest.

This invention is also directed towards vectors which comprise a gene of 
interest placed under the control of a DNA regulatory element derived from the 
genomic sequence encoding soybean seed coat peroxidase. Such a regulatory element 
includes nucleotides 1-1532 of SEQ ID NO:2, or at least 24 contiguous nucleotides
selected from nucleotides 1-1532 of SEQ ID NO:2. Elements comprising nucleotides
412-1041, 1234-2263 or 2430-2691 of SEQ ID NO:2, or 32 contiguous nucleotides
selected from nucleotides 412-1041 of SEQ ID NO:2, 23 contiguous nucleotides
selected from nucleotides 1234-2263 of SEQ ID NO:2, or 22 contiguous nucleotides
selected from nucleotides 2430-2691 of SEQ ID NO:2 may also be used.

This invention also embraces prokaryotic and eukaryotic cells comprising the 
vectors identified above. Such cells may include bacterial, insect, mammalian, and 
plant cell cultures. 


This invention also provides for transgenic plants comprising the seed coat 
peroxidase gene under control of constitutive or inducible promoters. Furthermore,
this invention also relates to transgenic plants comprising the DNA regulatory regions
of nucleotides 1-1532 of SEQ ID NO:2 controlling a gene of interest, or comprising
genes of interest in functional association with genomic DNA sequences exemplified
by nucleotides 412-1041, 1234-2263 or 2430-2691 of SEQ ID NO:2. Also embraced
by this invention are transgenic plants having regulatory regions comprising at least 24
contiguous nucleotides selected from nucleotides 1-1532 of SEQ ID NO:2, 32
contiguous nucleotides selected from nucleotides 412-1041 of SEQ ID NO:2, 23
contiguous nucleotides selected from nucleotides 1234-2263 of SEQ ID NO:2, or 22
contiguous nucleotides selected from nucleotides 2430-2691 of SEQ ID NO:2.


This invention is also directed to a method for the production of soybean seed
coat peroxidase in a host cell comprising:

i) transforming the host cell with a vector comprising an oligonucleotide 
sequence that encodes soybean seed coat peroxidase; and 

ii) culturing the host cell under conditions to allow expression of the 
soybean seed coat peroxidase.


This invention also provides for a process for producing a heterologous gene 
of interest within seed coats of a transformed plant, comprising propagating a plant 
transformed with a vector comprising a gene of interest under the control of 
nucleotides 1-1532 of SEQ ID NO:2. Furthermore, this invention embraces a process
for producing a heterologous gene of interest within seed coats of a transformed plant,
comprising propagating a plant transformed with a vector comprising a gene of interest
under the control of a regulatory region comprising at least 24 nucleotides selected
from nucleotides 1-1532 of SEQ ID NO:2.
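
By way of illustration only, whether a candidate DNA molecule shares a run of at least 24 contiguous nucleotides with a reference regulatory region may be checked computationally. The Python sketch below is a minimal example under stated assumptions: the function names are hypothetical, only exact matches on the same strand are considered, and the sequences shown are placeholders that are not taken from SEQ ID NO:2.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch of `candidate` that also
    occurs in `reference` (exact match, same strand, case-insensitive)."""
    candidate, reference = candidate.upper(), reference.upper()
    best = 0
    # prev[j] holds the length of the match ending at the previous candidate
    # base and reference position j (longest-common-substring recurrence).
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def meets_threshold(candidate: str, reference: str, min_run: int = 24) -> bool:
    """True if the candidate shares at least `min_run` contiguous nucleotides."""
    return longest_shared_run(candidate, reference) >= min_run


# Placeholder sequences for illustration only; NOT taken from SEQ ID NO:2.
reference_region = "ATGCGTACGTTAGCCTAGGCTTAACGGATCCGTTAGCAATCGGCTA"
fragment = "GGG" + reference_region[5:30] + "TTT"  # embeds a 25-nt stretch
short_probe = "ACGTACGTACGT"                        # only 12 nt long
```

In this sketch `meets_threshold(fragment, reference_region)` is true and `meets_threshold(short_probe, reference_region)` is false. The thresholds of 32, 23 or 22 contiguous nucleotides recited for the other regions can be tested by passing a different `min_run`.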


Although the present invention is exemplified by a soybean seed coat peroxidase 
and adjacent DNA regulatory regions, in practice any gene of interest can be placed 
downstream from the DNA regulatory region for seed coat specific expression.


BRIEF DESCRIPTION OF THE DRAWINGS 

These and other features of the invention will become more apparent from the 
following description in which reference is made to the appended drawings wherein: 

Figure 1 is the cDNA and deduced amino acid sequence of soybean seed coat 
peroxidase. Nucleotides are numbered by assigning +1 to the first base of the 
ATG start codon; amino acids are numbered by assigning +1 to the N-terminal
Gln residue after cleavage of the putative signal sequence. The N-terminal
signal sequence, the region of the active site, and the heme-binding domain are
underlined. The numerals I, II and III placed directly above single nucleotide
gaps in the sequence indicate the three intron splice positions. The target site
and direction of five different PCR primers are shown with dotted lines above
the nucleotide sequence. An asterisk (*) marks the translation stop codon. 

Figure 2 is the genomic DNA sequence of the soybean seed coat peroxidase.

Figure 3 is a comparison of soybean seed coat peroxidase with other closely related 
plant peroxidases. The GenBank accession numbers are provided next to the 
name of the plant from which the peroxidase was isolated. The accession 
number for the soybean sequence is L78163. (A) A comparison of the nucleic 
acid sequences; (B) A comparison of the amino acid sequences.

Figure 4 shows restriction fragment length polymorphisms between EpEp and epep
genotypes using the seed coat peroxidase cDNA as probe. Genomic DNA of
soybean lines OX312 (epep) and OX347 (EpEp) was digested with restriction
enzyme, separated by electrophoresis in a 0.5% agarose gel, transferred to
nylon, and hybridized with 32P-labelled cDNA encoding the seed coat
peroxidase. The size of the hybridizing fragments was estimated by comparison
to standards and is indicated on the right.


Figure 5 exhibits the structure of the Ep locus. A 17 kb fragment including the Ep
locus is illustrated schematically. A 3.3 kb portion of the gene is enlarged and
exons and introns are represented by shaded and open boxes, respectively. The
final enlargement of the 5' region shows the location and DNA sequence
around the 87 bp deletion occurring in the ep allele of soybean line OX312.
Nucleotides are numbered by assigning +1 to the first base of the ATG start
codon.


Figure 6 displays PCR analysis of EpEp and epep genotypes using primers derived
from the seed coat peroxidase cDNA. Genomic DNA from soybean lines
OX312 (epep) and OX347 (EpEp) was used as template for PCR analysis with
four different primer sets. Amplification products were separated by
electrophoresis through a 0.8% agarose gel and visualized under UV light after
staining with ethidium bromide. Genotype and primer combinations are
indicated at the top of the figure. The sizes in base pairs of the amplified
fragments are indicated on the right.


Figure 7 exhibits PCR analysis of an F2 population from a cross of EpEp and epep
genotypes. Genomic DNA was used as template for PCR analysis of the
parents (P) and 30 individuals. The cross was derived from the soybean lines
OX312 (epep) and OX347 (EpEp). Plants were self pollinated and seeds were
collected and scored for seed coat peroxidase activity. The symbols (-) and (+)
indicate low and high seed coat peroxidase activity, respectively. Primers
prx9+ and prx10- were used in the amplification reactions. Products were
separated by electrophoresis through a 0.8% agarose gel and visualized under
UV light after staining with ethidium bromide. The migration of molecular
markers and their corresponding size in kb is also shown (lanes M).


Figure 8 displays PCR analysis of six different soybean cultivars with primers derived
from the seed coat peroxidase cDNA sequence. Genomic DNA was used as
template for PCR analysis of three EpEp cultivars and three epep cultivars.
Primers used in the amplification reactions and the size of the DNA product are
indicated on the left. Products were separated by electrophoresis through a
0.8% agarose gel and visualized under UV light after staining with ethidium
bromide.

(A) Forward and reverse primers are downstream from deletion

(B) Forward primer anneals to site within deletion

(C) Primers span deletion


Figure 9 shows the accumulation of peroxidase RNA in tissues of EpEp and epep

plants. Figure 9(A): A comparison of peroxidase transcript abundance 
in cultivars Harosoy 63 (Ep) or Marathon (ep). Seed and pod tissues 
were sampled at a late stage of development corresponding to a whole 
seed fresh weight of 250 mg. Root and leaf tissue was from six week 
old plants. Autoradiograph exposed for 96 h. Figure 9(B): 
Developmental expression of peroxidase in cultivar Harosoy 63 (Ep). 
Flowers were sampled immediately after opening. Seed coat tissues 
were sampled at four stages of development corresponding to a whole 
seed fresh weight of: lane 1, 50 mg; lane 2, 100 mg; lane 3, 200 mg; 
lane 4, 250 mg. Autoradiograph exposed for 20 h.

DESCRIPTION OF PREFERRED EMBODIMENT

The present invention is directed to a novel oligonucleotide sequence encoding a
seed coat peroxidase and associated DNA regulatory regions.


According to the present invention, DNA sequences that are "substantially
homologous" include sequences that are identified under conditions of high
stringency. "High stringency" refers to Southern hybridization conditions employing
washes at 65 °C with 0.1 × SSC, 0.5% SDS.


By "DNA regulatory region" it is meant any region within a genomic sequence 
that has the property of controlling the expression of a DNA sequence that is operably 
linked with the regulatory region. Such regulatory regions may include promoter or 
enhancer regions, and other regulatory elements recognized by one of skill in the art. 
A segment of the DNA regulatory region is exemplified in this invention, however, as 
is understood by one of skill in the art, this region may be used as a probe to identify 
surrounding regions involved in the regulation of adjacent DNA, and such surrounding 
regions are also included within the scope of this invention. 


In the context of this disclosure, the term "promoter" or "promoter region"
refers to a sequence of DNA, usually upstream (5') to the coding sequence of a 
structural gene, which controls the expression of the coding region by providing the 
recognition for RNA polymerase and/or other factors required for transcription to start 
at the correct site.

There are generally two types of promoters, inducible and constitutive. An
"inducible promoter" is a promoter that is capable of directly or indirectly activating
transcription of one or more DNA sequences or genes in response to an inducer. In
the absence of an inducer the DNA sequences or genes will not be transcribed.
Typically the protein factor that binds specifically to an inducible promoter to activate
transcription is present in an inactive form which is then directly or indirectly
converted to the active form by the inducer. The inducer can be a chemical agent such
as a protein, metabolite, growth regulator, herbicide or phenolic compound or a
physiological stress imposed directly by heat, cold, salt, or toxic elements or indirectly
through the action of a pathogen or disease agent such as a virus. A plant cell
containing an inducible promoter may be exposed to an inducer by externally applying
the inducer to the cell or plant such as by spraying, watering, heating or similar
methods.

By "constitutive promoter" it is meant a promoter that directs the expression 
of a gene throughout the various parts of a plant and continuously throughout plant
development. Examples of known constitutive promoters include those associated with 
the CaMV 35S transcript and Agrobacterium Ti plasmid nopaline synthase gene. 

The chimeric gene constructs of the present invention can further comprise a 
3' untranslated region. A 3' untranslated region refers to that portion of a gene
comprising a DNA segment that contains a polyadenylation signal and any other
regulatory signals capable of effecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by effecting the addition of polyadenylic
acid tracts to the 3' end of the mRNA precursor. Polyadenylation signals are
commonly recognized by the presence of homology to the canonical form
5'-AATAAA-3', although variations are not uncommon.
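
As a non-limiting illustration, candidate polyadenylation signals of the canonical form 5'-AATAAA-3' can be located in a sequence with a simple motif scan. The Python sketch below is an assumption-laden example: the function name and the short test sequence are hypothetical, and only exact matches to the canonical hexamer are reported, so the variant signals mentioned above would not be found by it.

```python
import re
from typing import List


def find_polya_signals(seq: str, motif: str = "AATAAA") -> List[int]:
    """Return the 1-based start position of every exact occurrence of
    `motif` in `seq`, including overlapping occurrences."""
    # A zero-width lookahead lets finditer report overlapping hits.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() + 1 for match in pattern.finditer(seq.upper())]


# Hypothetical 3' fragment for illustration; not taken from SEQ ID NO:2.
example_fragment = "GCTTAATAAAGTCTGAATAAATTTGC"
signal_positions = find_polya_signals(example_fragment)  # [5, 16]
```

Positions are 1-based to match the nucleotide numbering used for the sequences in this disclosure.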


Examples of suitable 3' regions are the 3' transcribed non-translated regions
containing a polyadenylation signal of Agrobacterium tumour inducing (Ti) plasmid
genes, such as the nopaline synthase (Nos) gene, and plant genes such as the soybean
storage protein genes and the small subunit of the ribulose-1,5-bisphosphate
carboxylase (ssRUBISCO) gene. The 3' untranslated region from the structural gene
of the present construct can therefore be used to construct chimeric genes for
expression in plants.


The chimeric gene construct of the present invention can also include further 
enhancers, either translation or transcription enhancers, as may be required. These 

enhancer regions are well known to persons skilled in the art, and can include the ATG
initiation codon and adjacent sequences. The initiation codon must be in phase with
the reading frame of the coding sequence to ensure translation of the entire sequence.
The translation control signals and initiation codons can be from a variety of origins,
both natural and synthetic. Translational initiation regions may be provided from the
source of the transcriptional initiation region, or from the structural gene. The
sequence can also be derived from the promoter selected to express the gene, and can
be specifically modified so as to increase translation of the mRNA.

To aid in identification of transformed plant cells, the constructs of this
invention may be further manipulated to include plant selectable markers. Useful
selectable markers include enzymes which provide for resistance to an antibiotic such
as gentamycin, hygromycin, kanamycin, and the like. Similarly, enzymes providing
for production of a compound identifiable by colour change such as GUS
(β-glucuronidase), or luminescence, such as luciferase, are useful.


Also considered part of this invention are transgenic plants containing the 
chimeric gene construct of the present invention. Methods of regenerating whole 
plants from plant cells are known in the art, and the method of obtaining transformed 

and regenerated plants is not critical to this invention. In general, transformed plant
cells are cultured in an appropriate medium, which may contain selective agents such
as antibiotics, where selectable markers are used to facilitate identification of
transformed plant cells. Once callus forms, shoot formation can be encouraged by
employing the appropriate plant hormones in accordance with known methods and the
shoots transferred to rooting medium for regeneration of plants. The plants may then
be used to establish repetitive generations, either from seeds or using vegetative
propagation techniques.


The constructs of the present invention can be introduced into plant cells using 
Ti plasmids, Ri plasmids, plant virus vectors, direct DNA transformation,
micro-injection, electroporation, etc. For reviews of such techniques see for example
Weissbach and Weissbach (1988) and Grierson and Covey (1988). The present
invention further includes a suitable vector comprising the chimeric gene construct.

Buttery and Buzzell (1968) showed that the amount of peroxidase activity 
present in seed coats may vary substantially among different cultivars. The presence 

of a single dominant gene Ep causes a high seed coat peroxidase phenotype (Buzzell
and Buttery, 1969). Homozygous recessive epep plants are ~100-fold lower in seed
coat peroxidase activity. This results from a reduction in the amount of peroxidase
enzyme present, primarily in the hourglass cells of the subepidermis (Gijzen et al.,
1993). In plants carrying the Ep gene, peroxidase is heavily concentrated in the
hourglass cells (osteosclereids). These cells form a highly differentiated cell layer with
thick, elongated secondary walls and large intercellular spaces (Baker et al., 1987).

Screening a seed coat cDNA library prepared from EpEp plants with a 
degenerate primer derived from the active site domain of plant peroxidase resulted in 
a high frequency of positive clones. Many of these clones encode identical cDNA
molecules and indicate that the corresponding mRNA is an abundant transcript in 
developing seed coat tissues. The sequence of the cDNA is shown in Figure 1. 

Previous studies on soybean seed coat peroxidase indicated that this enzyme is 
heavily glycosylated and that carbohydrate contributes 18% of the mass of the
apo-enzyme (Gray et al., 1996). The seven potential glycosylation sites identified from the
amino acid sequence of the seed coat peroxidase (Figure 1) would accommodate the
five or six N-linked glycosylation sites proposed by Gray et al. (1996). The
heme-binding domain encompasses residues Asp161 to Phe171 and the acid-base catalysis
region from Gly33 to Cys44. The two regions are highly conserved among plant
peroxidases and are centred around functional histidine residues, His169 and His40.
There are eight conserved cysteine residues in the mature protein that provide for four
disulfide bridges found in other plant peroxidases and predicted from the crystal
structure of peanut peroxidase (Welinder, 1992; Schuller et al., 1996). Other
conserved areas include residues Cys91 to Ala105 and Val119 to Leu127 that occur in
or around helix D. The most divergent aspects of the seed coat peroxidase protein
sequence are the carboxy- and amino-terminal regions. These sequences probably
provide special targeting signals for the proper processing and delivery of the peptide
chain. It is possible the carboxy-terminal extension of the seed coat peroxidase is
removed at maturity, as has been shown for certain barley and horseradish peroxidases
(Welinder, 1992).


The molecular mass of the enzyme has been determined by denaturing gel
electrophoresis to be 37 kDa (Sessa and Anderson, 1981; Gillikin and Graham, 1991)
or 43 kDa (Gijzen et al., 1993). Analysis by mass spectrometry indicated a mass of
40,622 Da for the apo-enzyme and 33,250 Da after deglycosylation (Gray et al.,
1996). These values are in good agreement with the mass of 35,377 Da calculated from
the predicted amino acid sequence for the mature apo-protein prior to glycosylation and
other modifications. Huangpu et al. (1995) reported an anionic seed coat peroxidase
having a Mr of 30,577 Da and characterized a partial cDNA encoding this protein.
This 1031 bp cDNA contained an open reading frame of 849 bp encoding a 283 amino
acid protein. There are several differences between this reported sequence and the
sequence of this invention that are manifest at the amino acid level (see Figure 3 for
sequence comparison). The enzyme encoded by the gene reported by Huangpu et al.
is different from that of this invention as the peroxidase of this invention has a Mr of
35,377 Da.
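
The relationship between the masses quoted above can be checked with simple arithmetic. The following Python sketch is offered only as a worked example; the variable names are illustrative, and the only inputs are the mass values already cited in this section.

```python
# Masses in daltons, as cited in the text above.
glycosylated_da = 40_622    # apo-enzyme, mass spectrometry (Gray et al., 1996)
deglycosylated_da = 33_250  # after deglycosylation (Gray et al., 1996)
predicted_da = 35_377       # calculated from the predicted mature sequence
huangpu_da = 30_577         # protein reported by Huangpu et al. (1995)

# Mass attributable to carbohydrate, and its share of the glycosylated enzyme.
carbohydrate_da = glycosylated_da - deglycosylated_da         # 7372 Da
carbohydrate_percent = 100 * carbohydrate_da / glycosylated_da  # about 18.1

# Difference between the predicted mass and the Huangpu et al. protein.
difference_da = predicted_da - huangpu_da                     # 4800 Da
```

The carbohydrate share computed this way, about 18%, is consistent with the 18% carbohydrate contribution cited from Gray et al. (1996).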


Genomic DNA blots probed with the seed coat peroxidase cDNA produced two 
or three hybridizing fragments of varying intensity with most restriction enzyme 
digestions, despite the fact that several peroxidase isozymes are present in soybean. The results
indicate that this seed coat peroxidase is present as a single gene that does not share
sufficient homology with most other peroxidase genes to anneal under conditions of 
high stringency. 


The genomic DNA sequence comprises four exons spanning bp 1533-1752
(exon 1), 2383-2574 (exon 2), 3605-3769 (exon 3) and 4033-4516 (exon 4) and three
introns comprising 1752-2382 (intron 1), 2575-3604 (intron 2) and 3770-4032 (intron
3), of SEQ ID NO:2. Features of the upstream regulatory region of the genomic DNA
include a TATA box centred on bp 1487 and a cap signal 32 bp downstream centred on
bp 1520. Also noted within the genomic sequence are three polyadenylation signals
centred on bp 4520, 4598 and 4663, and a polyadenylation site at bp 4700.
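
For convenience, the exon and intron coordinates recited above can be tabulated and their lengths computed. The Python sketch below is illustrative only: the dictionary and function names are hypothetical, the coordinates are used exactly as stated in the text (bp 1752 is listed in both exon 1 and intron 1), and each span is treated as inclusive of both end positions.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]  # inclusive (start, end) positions in SEQ ID NO:2

exons: Dict[str, Span] = {
    "exon 1": (1533, 1752),
    "exon 2": (2383, 2574),
    "exon 3": (3605, 3769),
    "exon 4": (4033, 4516),
}
introns: Dict[str, Span] = {
    "intron 1": (1752, 2382),
    "intron 2": (2575, 3604),
    "intron 3": (3770, 4032),
}


def span_length(span: Span) -> int:
    """Number of base pairs in an inclusive span."""
    start, end = span
    return end - start + 1


exon_lengths = {name: span_length(s) for name, s in exons.items()}
intron_lengths = {name: span_length(s) for name, s in introns.items()}
total_exon_bp = sum(exon_lengths.values())  # 220 + 192 + 165 + 484 = 1061
```

With these coordinates the introns are 631, 1030 and 263 bp long, and the four exons together span 1061 bp.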

This promoter is considered seed coat specific since the peroxidase protein
encoded by the Ep gene accumulates in the seed coat tissues, especially in the hourglass
cells of the subepidermis, and is not expressed in other tissues, aside from a marginal
expression of peroxidase in the root tissues. This is also true at the transcriptional
level (see Figure 9). The DNA regulatory regions of the genomic sequence of Figure
2 are used to control the expression of the adjacent peroxidase gene in seed coat tissue.
Such regulatory regions include nucleotides 1-1532. Other regions of interest include
nucleotides 1752-2382, 2575-3604 and/or 3770-4032 of SEQ ID NO:2. Therefore
other proteins of interest may be expressed in seed coat tissues by placing a gene
capable of expressing the protein of interest under the control of the DNA regulatory
elements of this invention. Genes of interest include but are not restricted to herbicide
resistant genes, genes encoding viral coat proteins, or genes encoding proteins
conferring biological control of pest or pathogens such as an insecticidal protein, for
example B. thuringiensis toxin. Other genes include those capable of the production
of proteins that alter the taste of the seed and/or that affect the nutritive value of the
soybean.

A modified DNA regulatory sequence may be obtained by introducing changes 
into the natural sequence. Such modifications can be done through techniques known 
to one of skill in the art such as site-directed mutagenesis, reducing the length of the 
regulatory region using endonucleases or exonucleases, increasing the length through
the insertion of linkers or other sequences of interest. Reducing the size of the DNA
regulatory region may be achieved by removing 3' or 5' regions of the regulatory
region of the natural sequence by using an exonuclease such as BAL 31 (Sambrook et
al., 1989). However, any such DNA regulatory region must still function as a seed coat
specific DNA regulatory region.




It may be readily determined if such modified DNA regulatory elements are 
capable of acting in a seed coat specific manner by transforming plant cells with such
regulatory elements controlling the expression of a suitable marker gene, culturing 
these plants and determining the expression of the marker gene within the seed coat as 
outlined above. One may also analyze the efficacy of DNA regulatory elements by 
introducing constructs comprising a DNA regulatory element of interest operably 
linked with an appropriate marker into seed coat tissues by using particle bombardment 
directed to seed coat tissue and determining the degree of expression of the regulatory 
region as is known to one of skill in the art. 




Two tandemly arranged genes encoding anionic peroxidase expressed in stems 
of Populus kitakamiensis, prxA3a and prxA4a, have been cloned and characterized
(Osakabe et al., 1995). Both of these genomic sequences contained four exons and
three introns and encoded proteins of 347 and 343 amino acids, respectively. The two
genes encode distinct isozymes with deduced Mr values of 33.9 and 34.6 kDa.
Furthermore, a 532 bp promoter derived from the peroxidase gene of Armoracia 
rusticana has also been reported (Toyobo KK, JP 4,126,088, April 27, 1992). 
However, a search using GenBank revealed no substantial similarity between the
promoter region, or introns 1, 2 and 3 of this invention and those within the literature.

Digestion of the genomic DNA with BamHI or SacI revealed restriction
fragment length polymorphisms that distinguished EpEp and epep genotypes. Although
the XbaI digestion did not produce a readily detectable polymorphism, the size of the
hybridizing fragment in both genotypes was ~14 kb. Thus, a 0.3 kb size difference is
outside of the resolving power of the separation for fragments this large. Sequence
analysis of EpEp and epep genotypes indicates that the mutant ep allele is missing 87
bp of sequence at the 5' end of the structural gene. This would account for the
drastically reduced amounts of peroxidase enzyme present in seed coats of epep plants
since the deletion includes the translation start codon and the entire N-terminal signal
sequence. However, the 87 bp deletion cannot account for the differences observed in
the RFLP analysis since the missing fragment does not include a BamHI site and is
much smaller than the 0.3 kb polymorphism detected in the SacI digestion. Thus,
other genetic rearrangements must occur in the vicinity of the ep locus that lead to
these polymorphisms.


The results shown here indicate that the mutation causing low seed coat
peroxidase activity occurs in the structural gene encoding the enzyme. This mutation
is an 87 bp deletion in the 5' region of the gene encompassing the translation start site.
Several different low peroxidase cultivars share a similar mutation in the same area,
suggesting that the recessive ep alleles have a common origin or that the region is
prone to spontaneous deletions or rearrangements.

Due to the industrial interest in soybean seed coat peroxidase, alternate sources
for the production of this enzyme are needed. The DNA of this invention, encoding
the seed coat soybean peroxidase under the control of a suitable promoter and
expressed within a host of interest, can be used for the preparation of recombinant
soybean seed coat peroxidase enzyme.


Soybean seed coat peroxidase has been characterized as a lignin-type peroxidase
that has industrially significant properties, i.e.: high activity and stability under acidic
conditions; wide substrate specificity; catalytic properties equivalent to those
of Phanerochaete chrysosporium lignin peroxidase (the currently preferred enzyme used
for treatment of industrial waste waters; Wick 1995) while being at least 150-fold more
stable; and greater stability than horseradish peroxidase, which is also used in
industrial effluent treatments and medical diagnostic kits (McEldoon et al., 1995).
These properties are useful within industrial applications for the degradation of natural
aromatic polymers including lignin and coal (McEldoon et al., 1995), and underlie the
preferred use of soybean peroxidase, over that of horseradish peroxidase, in medical
diagnostic tests as an enzyme label for antigens, antibodies, oligonucleotide probes,
and within staining techniques (Wick 1995). Soybean peroxidase is also used in the
deinking of printed waste paper (Johnson et al., U.S. 5,270,770; December 6, 1994)
and for the biocatalytic oxidation of primary alcohols (Johnson et al., U.S. 5,391,488;
February 13, 1996). Soybean peroxidase has also been used as a replacement for
chlorine in the pulp and paper industry, in order to remove chlorine, phenolic or
aromatic amine containing pollutants from industrial waste waters (Wick 1995), or as
a formaldehyde replacement (Freiberg, 1995) for use in adhesives, abrasives, and
protective coatings (e.g. varnish and resins, Wick 1995).


Furthermore, the seed coat peroxidase gene may be expressed in an organ or
tissue specific manner within a plant. For example, the quality and strength of cotton
fibre can be improved through the over-expression of cotton or horseradish peroxidase
placed under the control of a fibre-specific promoter (Maliyakal, WO 95/08914; April
6, 1995).


Similarly, seed-specific DNA regulatory regions of this invention may be used
to control expression of genes of interest such as:

i) genes encoding herbicide resistance, or

ii) biological control of insects or pathogens (e.g. B. thuringiensis), or

iii) viral coat proteins to protect against viral infections, or

iv) proteins of commercial interest (e.g. pharmaceutical), and

v) proteins that alter the nutritive value, taste, or processing of seeds

within the seed coat of plants.


While this invention is described in detail with particular reference to preferred
embodiments thereof, said embodiments are offered to illustrate but not to limit the
invention.

EXAMPLES


Plant material 


All soybean (Glycine max [L.] Merr) cultivars and breeding lines were from the
collection at Agriculture Canada, Harrow, Ontario.


Seed Coat cDNA library Construction and Screening 


High seed coat peroxidase (EpEp) soybean cultivar Harosoy 63 plants were
grown in field plots outdoors. Pods were harvested 35 days after flowering and seeds
in the mid-to-late developmental stage were excised. The average fresh mass was 250
mg per seed. Seed coats were dissected and immediately frozen in liquid nitrogen. The
frozen tissue was lyophilized and total RNA extracted in 100 mM Tris-HCl pH 9.0,
20 mM EDTA, 4% (w/v) sarkosyl, 200 mM NaCl, and 16 mM DTT, and precipitated
with LiCl using the standard phenol/chloroform method described by Wang and
Vodkin (1994). The poly(A)+ RNA was purified on oligo(dT) cellulose columns prior
to cDNA synthesis, size selection, ligation into the λ ZAP Express vector, and
packaging according to instructions (Stratagene). A degenerate oligonucleotide with the
5' to 3' sequence of TT(C/T)CA(C/T)GA(C/T)TG(C/T)TT(C/T)GT was 5' end
labelled to high specific activity and used as a probe to isolate peroxidase cDNA clones
(Sambrook et al., 1989). Duplicate plaque lifts were made to nylon filters (Amersham),
UV fixed, and prehybridized at 36°C for 3 h in 6 x SSC, 20 mM Na2HPO4 (pH 6.8),
5 x Denhardt's, 0.4% SDS, and 500 µg/mL salmon sperm DNA. Hybridization was
in the same buffer, without Denhardt's, at 36°C for 16 h. Filters were washed quickly
with several changes of 6 x SSC and 0.1% SDS, first at room temperature and finally
at 40°C, prior to autoradiography for 16 h at -70°C with an intensifying screen.
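
The degenerate probe described above can be unpacked with a short script. This is a minimal sketch in Python and is not part of the original protocol; the function name expand_degenerate is hypothetical. It expands each (C/T) position to show how many distinct sequences the labelled probe mixture represents and how long each one is.

```python
import re
from itertools import product

# Probe as written in the text, with each (C/T) marking a two-fold degenerate position.
PROBE = "TT(C/T)CA(C/T)GA(C/T)TG(C/T)TT(C/T)GT"

def expand_degenerate(probe):
    """Return every concrete sequence encoded by a probe written with (X/Y) choices."""
    # Tokens are either a parenthesised choice such as (C/T) or a single base.
    tokens = re.findall(r"\([ACGT](?:/[ACGT])+\)|[ACGT]", probe)
    choices = [token.strip("()").split("/") for token in tokens]
    return ["".join(combo) for combo in product(*choices)]

sequences = expand_degenerate(PROBE)
probe_length = len(sequences[0])   # bases per oligonucleotide
degeneracy = len(sequences)        # number of distinct sequences in the mixture
print(probe_length, degeneracy)
```

Each degenerate position falls at the third base of a codon (TTY, CAY, GAY, TGY, TTY), so under the standard genetic code every variant encodes the same residues, which is consistent with a probe aimed at a conserved protein motif.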

Genomic DNA Isolation, Library Construction, and DNA Blot Analysis 

Soybean genomic DNA was isolated from leaves of greenhouse grown plants
or from etiolated seedlings grown in vermiculite. Plant tissue was frozen in liquid
nitrogen and lyophilized before extraction and purification of DNA according to the
method of Dellaporta et al. (1983). Restriction enzyme digestion of 30 µg DNA,
separation on 0.5% agarose gels and blotting to nylon membranes followed standard
protocols (Sambrook et al., 1989). For construction of the genomic library, DNA
purified from Harosoy 63 leaf tissue was partially digested with BamHI and ligated into
the λ FIX II vector (Stratagene). Gigapack XL packaging extract (Stratagene) was used
to select for inserts of 9 to 22 kb. After library amplification, duplicate plaque lifts
were hybridized to cDNA probe.

Blots or filter lifts were prehybridized for 2 h at 65°C in 6 x SSC, 5 x
Denhardt's, 0.5% SDS, and 100 µg/mL salmon sperm DNA. Radiolabelled cDNA
probe (20 to 50 ng) was prepared using the Ready-to-Go labelling kit (Pharmacia) and
32P-dCTP (Amersham). Unincorporated 32P-dCTP was removed by spin column
chromatography before adding radiolabelled cDNA to the hybridization buffer
(identical to prehybridization buffer without Denhardt's). Hybridization was for 20 h
at 65°C. Membranes were washed twice for 15 min at room temperature with 2 x SSC,
0.5% SDS, followed by two 30 min washes at 65°C with 0.1 x SSC, 0.5% SDS.
Autoradiography was for 20 h at -70°C using an intensifying screen and X-OMAT film
(Kodak).


DNA Sequencing 


Sequencing of DNA was performed using dye-labelled terminators and Taq-FS
DNA polymerase (Perkin-Elmer). The PCR protocol consisted of 25 cycles of a 30 sec
melt at 96°C, 15 sec annealing at 50°C, and 4 min extension at 60°C. Samples were
analyzed on an Applied Biosystems 373A Stretch automated DNA sequencer.


Polymerase Chain Reaction 


PCR amplifications contained 1 ng template DNA, 5 pmol each primer, 1.5
mM MgCl2, 0.15 mM deoxynucleotide triphosphates mix, 10 mM Tris-HCl, 50 mM
KCl, pH 8.3, and 1 unit of Taq polymerase (Gibco BRL) in a total volume of 25 µL.
Reactions were performed in a Perkin-Elmer 480 thermal cycler. After an initial 2 min
denaturation at 94°C, there were 35 cycles of 1 min denaturation at 94°C, 1 min
annealing at 52°C, and 2 min extension at 72°C. A final 7 min extension at 72°C
completed the program. The following primers were used for PCR analysis of genomic
DNA:

prx2+     CTTCCAAATATCAACTCAAT

prx6-     TAAAGTTGGAAAAGAAAGTA

prx9+     ATGCATGCAGGTTTTTCAGT

prx10-    TTGCTCGCTTTCTATTGTAT

prx12+    TCTTCGATGCTTCTTTCACC

prx29+    CATAAACAATACGTACGTGAT
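
As a rough check on the primer list above, the following Python sketch tabulates the length, GC content, and an approximate melting temperature for each primer. The text does not say how annealing conditions were chosen; the 2 °C per A/T plus 4 °C per G/C rule of thumb used here is an assumption introduced for illustration, and the helper names gc_percent and wallace_tm are hypothetical.

```python
# Primer sequences as listed in the text (5' to 3'); the suffix gives orientation.
PRIMERS = {
    "prx2+": "CTTCCAAATATCAACTCAAT",
    "prx6-": "TAAAGTTGGAAAAGAAAGTA",
    "prx9+": "ATGCATGCAGGTTTTTCAGT",
    "prx10-": "TTGCTCGCTTTCTATTGTAT",
    "prx12+": "TCTTCGATGCTTCTTTCACC",
    "prx29+": "CATAAACAATACGTACGTGAT",
}

def gc_percent(seq):
    """Percentage of G and C bases in the primer."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees C (assumed here, not taken from the source):
    2 per A/T base plus 4 per G/C base."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    print(f"{name:7s} {len(seq):2d} nt  GC {gc_percent(seq):4.1f}%  Tm ~{wallace_tm(seq)} C")
```

The estimate is crude for oligonucleotides of this length and ignores salt and primer concentration, so it serves only as a sanity check against the 52°C annealing step.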


RNA Isolation 


For isolation of RNA, tissue was harvested from greenhouse grown plants,
dissected, frozen in liquid nitrogen, and lyophilized prior to extraction. Total RNA was
purified from seed coats, embryos, pods, leaves, and flowers using a standard
phenol/chloroform method (Sambrook et al., 1989). This method did not afford good
yields of RNA from roots; therefore, this tissue was extracted with TRIzol reagent
(Gibco BRL) and total RNA purified according to the manufacturer's instructions with an
additional phenol-chloroform extraction step. The amount of RNA was estimated by
measuring absorbance at 260 and 280 nm, and by electrophoretic separation in
formaldehyde gels followed by staining with ethidium bromide and comparison to
known standards. Total RNA (10 µg per sample) was prepared, subjected to
electrophoresis through a 1% agarose gel containing formaldehyde, and then stained
with ethidium bromide to ensure equal loading of samples. The gel was blotted to
nylon (Hybond-N, Amersham) according to standard methods and the RNA was fixed
to the membrane by UV cross linking.

Seed Coat Peroxidase Assays


The F3 seed was assayed for peroxidase activity to score the phenotype of the
population because the seed testa is derived from maternal tissue. The seeds were
briefly soaked in water and the seed coat was dissected from the embryo and placed in
a vial. Ten drops (~500 µL) of 0.5% guaiacol were added and the sample was left to
stand for 10 min before adding one drop (~50 µL) of 0.1% H2O2. An immediate
change in colour of the solution, from clear to red, indicates a positive result and high
seed coat peroxidase activity.
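
The reagent volumes in the assay above imply approximate final concentrations in the vial. The sketch below works them out in Python; it assumes the drop volumes are exactly 500 µL and 50 µL and ignores the volume contributed by the soaked seed coat, so the figures are indicative only. The helper name final_percent is hypothetical.

```python
def final_percent(stock_percent, stock_ul, total_ul):
    """Concentration of a reagent after dilution into the total assay volume."""
    return stock_percent * stock_ul / total_ul

GUAIACOL_UL = 500.0   # ten drops of 0.5% guaiacol (approximate volume from the text)
H2O2_UL = 50.0        # one drop of 0.1% H2O2 (approximate volume from the text)
TOTAL_UL = GUAIACOL_UL + H2O2_UL

guaiacol_final = final_percent(0.5, GUAIACOL_UL, TOTAL_UL)
h2o2_final = final_percent(0.1, H2O2_UL, TOTAL_UL)
print(f"guaiacol ~{guaiacol_final:.3f}%, H2O2 ~{h2o2_final:.4f}%")
```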


Example 1: The Seed Coat Peroxidase cDNA and genomic DNA sequences 


To isolate the seed coat peroxidase transcript, a cDNA library was constructed 
from developing seed coat tissue of the EpEp cultivar Harosoy 63. The primary 
library contained 10^ recombinant plaque forming units and was amplified prior to 
screening. A degenerate 17-mer oligonucleotide corresponding to the conserved active 
site domain of plant peroxidases was used to probe the library. In screening 10,000 
plaque forming units, 12 positive clones were identified. The cDNA insert size of the 
clones ranged from 0.5 to 2.5 kb, but six clones shared a common insert size of 1.3 
kb. These six clones (soyprx03, soyprx05, soyprx06, soyprx11, soyprxH, and
soyprx14) were chosen for further characterization since the 1.3 kb insert size matched
the expected peroxidase transcript size. Sequence analysis of the six clones showed that
they contained identical cDNA transcripts encoding a peroxidase and that each resulted
from an independent cloning event since the junction between the cloning vector and
the transcript was different in all cases.


Since it was not clear that the entire 5' end of the cDNA transcript was
complete in any of the cDNA clones isolated, the structural gene corresponding to the
seed coat peroxidase was isolated from a Harosoy 63 genomic library. A partial BamHI
digest of genomic DNA was used to construct the library and more than 10^ plaque
forming units were screened using the cDNA probe. A positive clone, G25-2-1-1-1,
containing a 17 kb insert was identified and a 4.7 kb region encoding the peroxidase
was sequenced (SEQ ID NO:2). This region includes 1532 nucleotides of the 5' region
of the peroxidase gene.


The genomic sequence matched the cDNA sequence except for three introns
encoded within the gene. The genomic sequence also revealed two additional
translation start codons, beginning one bp and 10 bp upstream from the 5' end of the
longest cDNA transcript isolated. Figure 1 shows the deduced cDNA sequence. The
open reading frame of 1056 bp encodes a 352 amino acid protein of 38,106 Da. A
heme-binding domain, a peroxidase active site signature sequence, and seven potential
N-glycosylation sites were identified from the deduced amino acid sequence. The first
26 amino acid residues conform to a membrane spanning domain. Cleavage of this
putative signal sequence releases a mature protein of 326 residues with a mass of
35,377 Da and an estimated pI of 4.4.
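
The figures in the preceding paragraph can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
ORF_BP = 1056          # open reading frame length
PRECURSOR_AA = 352     # residues in the deduced precursor protein
SIGNAL_AA = 26         # residues in the putative signal sequence
PRECURSOR_DA = 38106   # deduced mass of the precursor
MATURE_DA = 35377      # deduced mass of the mature protein

codons, leftover = divmod(ORF_BP, 3)         # codons in the ORF, and any partial codon
mature_aa = PRECURSOR_AA - SIGNAL_AA         # residues left after signal cleavage
mass_difference = PRECURSOR_DA - MATURE_DA   # mass change on removing the signal sequence
print(codons, leftover, mature_aa, mass_difference)
```

The 1056 bp reading frame corresponds exactly to 352 codons, and removing 26 residues leaves the 326 residue mature protein reported above.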


Relevant features of the genomic fragment (Figure 2) include four exons at bp
192-411 (exon 1; 1533-1751 of SEQ ID NO:2), 1042-1233 (exon 2; 2383-2574 of
SEQ ID NO:2), 2263-2429 (exon 3; 4033-4516 of SEQ ID NO:2) and 2692-3174
(exon 4; 1752-2382 of SEQ ID NO:2) and three introns at bp 412-1041 (intron 1;
1752-2382 of SEQ ID NO:2), 1234-2263 (intron 2; 2575-3604 of SEQ ID NO:2) and
2430-2691 (intron 3; 3770-4032 of SEQ ID NO:2). The 1532 bp regulatory region of
the genomic DNA includes a TATA box centred on bp 1487 and a cap signal 32 bp
downstream centred at bp 1520 of SEQ ID NO:2. Also noted within the genomic
sequence are three polyadenylation signals centred on bp 4520, 4598, 4700 and a
polyadenylation site at bp 4700 of SEQ ID NO:2.


Figure 3 illustrates the relationship between the soybean seed coat peroxidase
and other selected plant peroxidases. The soybean sequence is most closely related to
four peroxidase cDNAs isolated from alfalfa (see Figure 3), sharing from 65 to 67%
identity at the amino acid level with the alfalfa proteins (X90693, X90694, X90692,
el-Turk et al. 1996; L36156, Abrahams et al. 1994). When compared with other plant
peroxidases, soybean seed coat peroxidase exhibits from 60 to 65% identity with
poplar (D30653 and D30652, Osakabe et al. 1994) and flax (L0554, Omann and Tyson
1995); 50 to 60% identity with horseradish (M37156, Fujiyama et al. 1988), tobacco
(D11396, Osakabe et al. 1993), and cucumber (M91373, Rasmussen et al. 1992); and
49% identity with barley (L36093, Scott-Craig et al. 1994), wheat (X85228, Baga et
al. 1995) and tobacco (L02124, Diaz-De-Leon et al. 1993) peroxidases.


A comparison of the promoter region, 1-1532 of SEQ ID NO:2, indicates that
there are no similar sequences present within the GenBank database.

Example 2: DNA Blot Analysis Using the Seed Coat Peroxidase cDNA Probe
Reveals Restriction Fragment Length Polymorphisms Between EpEp and epep
Genotypes

Genomic DNA blots of OX347 (EpEp) and OX312 (epep) plants were
hybridized with 32P-labelled cDNA to estimate the copy number of the seed coat
peroxidase gene and to determine if this locus is polymorphic between the two
genotypes. Figure 4 shows the hybridization patterns after digestion with BamHI, XbaI,
and SacI. Restriction fragment length polymorphisms are clearly visible in the BamHI
and SacI digestions. The BamHI digestion produced a strongly hybridizing 17 kb
fragment and a faint 3.4 kb fragment in the EpEp genotype. The 3.4 kb BamHI
fragment is visible in the epep genotype but the 17 kb fragment has been replaced by
a signal at > 20 kb. The SacI digestion resulted in detection of three fragments in EpEp
and epep plants. At least two fragments were expected here since the cDNA sequence
has a SacI site within the open reading frame. However, the smallest and most strongly
hybridizing of these fragments is 5.2 kb in EpEp plants and 4.9 kb in epep plants.
Digestion with XbaI produced hybridizing fragments of ~14 kb and 7.8 kb for both
genotypes, with the larger fragment showing a stronger signal.


Example 3: A Deletion Mutation Occurs in the Recessive ep Locus


The structural gene encoding the seed coat peroxidase is schematically
illustrated in Figure 5. The 17 kb BamHI fragment encompassing the gene includes 191
bp of sequence upstream from the translation start codon, three introns of 631 bp, 1030
bp, and 263 bp, and 13 kb of sequence downstream from the polyadenylation site. The
arrangement of four exons and three introns and the placement of introns within the
sequence is similar to that described for other plant peroxidases (Simon, 1992; Osakabe
et al. 1995).


Primers were designed from the DNA sequence to compare EpEp and epep
genotypes by PCR analysis. Figure 6 shows PCR amplification products from four
different primer combinations using OX312 (epep) and OX347 (EpEp) genomic DNA
as template. The primer annealing site for prx29+ begins 182 bp upstream from the
ATG start codon; the remaining primer sites are shown in Figure 1. Amplification with
primers prx2+ and prx6-, and with prx12+ and prx10-, produced the expected
products of 1.9 kb and 860 bp, respectively, regardless of the Ep/ep genotype of the
template DNA. However, PCR amplification with primers prx9+ and prx10-, and with
prx29+ and prx10-, generated the expected products only when template DNA was
from plants carrying the dominant Ep allele. When template DNA was from an epep
genotype, no product was detected using primers prx9+ and prx10- and a smaller
product was amplified with primers prx29+ and prx10-. The products resulting from
amplification of OX312 or OX347 template DNA with primers prx29+ and prx10-
were directly sequenced and compared. The polymorphism is due to an 87 bp deletion
occurring within this DNA fragment in OX312 plants, as shown in Figure 5. This
deletion begins nine bp upstream from the translation start codon and includes 78 bp
of sequence at the 5' end of the open reading frame, including the prx9+ primer
annealing site.
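
The geometry of the deletion can be checked against the signal sequence length reported in Example 1. The following Python sketch uses only the figures given in the text; the variable names are illustrative.

```python
DELETION_BP = 87            # total length of the deletion in the ep allele
UPSTREAM_OF_START_BP = 9    # deleted bases lying upstream of the start codon
SIGNAL_AA = 26              # signal sequence length reported in Example 1

# Bases of the open reading frame removed by the deletion.
orf_bp_deleted = DELETION_BP - UPSTREAM_OF_START_BP
# Whole codons removed, and any leftover bases that would shift the reading frame.
codons_deleted, partial_codon_bp = divmod(orf_bp_deleted, 3)

print(orf_bp_deleted, codons_deleted, partial_codon_bp)
print("covers signal sequence:", codons_deleted >= SIGNAL_AA)
```

The 78 deleted coding bases correspond to 26 whole codons, which matches the statement that the start codon and the entire 26 residue signal sequence are missing from the ep allele.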


To test whether this deletion mutation cosegregates with the seed coat
peroxidase phenotype, genomic DNA from an F2 population segregating at the Ep locus
was amplified using primers prx9+ and prx10- and F3 seed was tested for seed coat
peroxidase activity. Figure 7 shows the results from this analysis. Of the 30 F2
individuals tested, all 23 that were high in seed coat peroxidase activity produced the
expected 860 bp PCR amplification product. The remaining seven F2 individuals with
low seed coat peroxidase activity produced no detectable PCR amplification products.
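
The counts reported above (23 high, 7 low) can be compared with the 3:1 ratio expected for a single dominant gene. The text itself does not report a statistical test; the chi-square calculation below is offered only as an illustrative check, written in plain Python, and the function name chi_square is hypothetical.

```python
def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

OBSERVED = [23, 7]           # high and low peroxidase individuals from the text
EXPECTED_RATIO = [3, 1]      # single dominant gene, assumed for this check
CRITICAL_5PCT_DF1 = 3.841    # chi-square critical value, 1 degree of freedom, p = 0.05

statistic = chi_square(OBSERVED, EXPECTED_RATIO)
consistent = statistic < CRITICAL_5PCT_DF1
print(f"chi-square = {statistic:.4f}, consistent with 3:1 = {consistent}")
```

With only 30 individuals the test has limited power, so agreement with 3:1 is supportive rather than conclusive.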


Finally, to determine if the OX312 (epep) and OX347 (EpEp) breeding lines are
representative of soybean cultivars that differ in seed coat peroxidase activity, several
cultivars were tested by PCR analysis using primer combinations targeted to the Ep
locus. Figure 8 shows results from this analysis of six different soybean cultivars, three
each of the homozygous dominant EpEp and recessive epep genotypes. As observed
with OX312 and OX347, amplification products of the expected size were produced
with primers prx12+ and prx10- regardless of the genotype, whereas epep genotypes
yielded no product with primers prx9+ and prx10- or a smaller fragment with primers
prx29+ and prx10-.
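
The amplification patterns described in this Example suggest a simple scoring rule, sketched below in Python. Treating the prx12+/prx10- reaction as a positive control is an assumption made for illustration, and the function name call_genotype and the sample labels are hypothetical.

```python
def call_genotype(control_product, prx9_product):
    """Interpret one sample's PCR results.

    control_product: True if prx12+/prx10- gave its expected band
                     (amplifies regardless of genotype, per the text).
    prx9_product:    True if prx9+/prx10- gave its expected band
                     (requires the dominant Ep allele, per the text).
    """
    if not control_product:
        return "no call"      # reaction failed, so a missing band is uninformative
    if prx9_product:
        return "Ep present"   # EpEp or Epep; this assay alone cannot separate them
    return "epep"

# Hypothetical samples: (control band seen, prx9+/prx10- band seen)
samples = {
    "sample_A": (True, True),
    "sample_B": (True, False),
    "sample_C": (False, False),
}
calls = {name: call_genotype(*bands) for name, bands in samples.items()}
print(calls)
```

Because the prx9+ annealing site lies inside the deletion, absence of that product behaves as a dominant marker; the prx29+/prx10- reaction, which yields a smaller fragment from the ep allele, would be needed to score both alleles in one reaction.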

Example 4: Developmental Pattern of Expression of the Ep gene


The seed coat peroxidase mRNA levels were determined by hybridizing RNA gel blots
with radiolabelled cDNA probe. The figure illustrates the transcript abundance in
various tissues of epep and EpEp plants. The mRNA accumulated to high levels in seed
coat tissues of EpEp plants, especially in the later stages of development when whole
seed fresh weight exceeded 50 mg. Low levels of transcript could also be detected in root
tissues but not in the flower, embryo, pod or leaf. The transcript could also be detected
in seed coat and root tissues of epep plants but in drastically reduced amounts compared
to the EpEp genotype. The reduced amounts of peroxidase mRNA present in seed coats
of epep plants indicate that the transcriptional process and/or the stability of the
resulting mRNA is severely affected. The Ep gene has a TATA box and a 5' cap signal
beginning 47 bp and 15 bp, respectively, upstream from the translation start codon.
The 87 bp deletion in the ep allele extends into the 5' cap signal and therefore could
interfere with transcript processing. Regardless, any resulting transcript will not be
properly translated since the AUG initiation codon and the entire amino-terminal
signal sequence are deleted from the ep allele. Not wishing to be bound by theory, the
lack of peroxidase accumulation in seed coats of epep plants appears to be due to at
least two factors, greatly reduced transcript levels and ineffective translation, resulting
from mutation of the structural gene encoding the enzyme. In summary, the results
indicate that the Ep gene regulatory elements can drive high level expression in a
tightly coordinated, tissue and developmentally specific manner.

All scientific publications and patent documents are incorporated herein by
reference.


The present invention has been described with regard to preferred
embodiments. However, it will be obvious to persons skilled in the art that a number
of variations and modifications can be made without departing from the scope of the
invention as described in the following claims.